NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

The Geometry of Tree-Based Sorting

Blelloch, Guy E.; Dobson, Magdalen (May 2023, Dagstuhl Repository)

We study the connections between sorting and the binary search tree (BST) model, with an aim towards showing that the fields are connected more deeply than is currently appreciated. While any BST can be used to sort by inserting the keys one-by-one, this is a very limited relationship and importantly says nothing about parallel sorting. We show what we believe to be the first formal relationship between the BST model and sorting. Namely, we show that a large class of sorting algorithms, which includes mergesort, quicksort, insertion sort, and almost every instance-optimal sorting algorithm, are equivalent in cost to offline BST algorithms. Our main theoretical tool is the geometric interpretation of the BST model introduced by Demaine et al. [18], which finds an equivalence between searches on a BST and point sets in the plane satisfying a certain property. To give an example of the utility of our approach, we introduce the log-interleave bound, a measure of the information-theoretic complexity of a permutation π, which is within a lg lg n multiplicative factor of a known lower bound in the BST model; we also devise a parallel sorting algorithm with polylogarithmic span that sorts a permutation π using comparisons proportional to its log-interleave bound. Our aforementioned result on sorting and offline BST algorithms can be used to show existence of an offline BST algorithm whose cost is within a constant factor of the log-interleave bound of any permutation π.
more » « less
Full Text Available
Parallel Nearest Neighbors in Low Dimensions with Batch Updates

Blelloch, Guy E.; Dobson, Magdalen (January 2022, 2022 Proceedings of the Symposium on Algorithm Engineering and Experiments (ALENEX))

We present a set of parallel algorithms for computing exact k-nearest neighbors in low dimensions. Many k-nearest neighbor algorithms use either a kd-tree or the Morton ordering of the point set; our algorithms combine these approaches using a data structure we call the zd-tree. We show that this combination is both theoretically efficient under common assumptions, and fast in practice. For point sets of size n with bounded expansion constant and bounded ratio, the zd-tree can be built in O(n) work with O(n^ε) span for constant ε < 1, and searching for the k-nearest neighbors of a point takes expected O(k log k) time. We benchmark our k-nearest neighbor algorithms against existing parallel k-nearest neighbor algorithms, showing that our implementations are generally faster than the state of the art as well as achieving 75x speedup on 144 hyperthreads. Furthermore, the zd-tree supports parallel batch-dynamic insertions and deletions; to our knowledge, it is the first k-nearest neighbor data structure to support such updates. On point sets with bounded expansion constant and bounded ratio, a batch-dynamic update of size k requires O(k log n/k) work with O(k^ε + polylog(n)) span.
more » « less
Poster: The Problem-Based Benchmark Suite (PBBS), V2

https://doi.org/10.1145/3503221.3508422

Anderson, Daniel; Blelloch, Guy E.; Dhulipala, Laxman; Dobson, Magdalen; Sun, Yihan (March 2022, Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)

The Problem-Based Benchmark Suite (PBBS) is a set of benchmark problems designed for comparing algorithms, implementations and platforms. For each problem, the suite defines the problem in terms of the input-output relationship, and supplies a set of input instances along with input generators, a default implementation, code for checking correctness or accuracy, and a timing harness. The suite makes it possible to compare different algorithms, platforms (e.g. GPU vs CPU), and implementations using different programming languages or libraries. The purpose is to better understand how well a wide variety of problems parallelize, and what techniques/algorithms are most effective. The suite was first announced in 2012 with 14 benchmark problems. Here we describe some significant updates. In particular, we have added nine new benchmarks from a mix of problems in text processing, computational geometry and machine learning. We have further optimized the default implementations; several are the fastest available for multicore CPUs, often achieving near perfect speedup on the 72 core machine we test them on. The suite now also supplies significantly larger default test instances, as well as a broader variety, with many derived from real-world data.
more » « less

Search for: All records